Comments:

# Summary ### Introduction ##### about hybrid reco : - many RS have an underlying HIN structure and are achieving hybrid reco (in the sense using both user feedbacks and content-based info) ##### difference with existing methods : - use different types of relations, benefit : use the fact that users consume an item for different reasons (e.g. : movie for genre, for director, etc) ##### Recommender System : - combine users feedback and various types of info in a collaborative filtering style - use metapath in HIN to generate reco - technical implementation uses Matrix Factorization ##### Datasets : - MovieLens 100K combined with IMDB and Yelp, implicit feedback only ##### Contributions: - study reco with implicit feedback in HIN - use network heterogeneity to spread preferences on the metapaths - generate personalized reco - specific case study : ML100K and Yelp ### Background and preliminaries ##### binary user feedback - explain how to generate the bipartite adjacency matrix ##### Heterogeneous Information Network - definition (using entity mapping function and link entity mapping function) - vocabulary to describe HIN ##### Matrix Factorization for implicit feedback - describe principle of MF (decomposing the feedback matrix) - resolution using NMF ##### Problem definition - how to make personalized recommendation based on implicit feedback in the form of a list of recommendations ### Meta-path based latent features ##### meta-path - definition and interest (types of paths in a HIN) - can be used to measure similarity and proximity between entities - ex: user [watches] movie [watched by] user [following] actor [starring] movie ##### user preference diffusion - type of meta-paths considered in the paper : user -> item -> * -> item (* may be tag, genre, director, plot for ML100K ; * may be category, customer, location for Yelp) - define user preference score : normalized weighted sum of the number of paths to a given item (eq 2) - if L types of metapaths, then L matrices R (user preference matrices) - use these scores to build the recommendation model ##### global recommendation model - define the recommendation mechanic which is inspired by MF - (?) MF may be achieved on each user preference matrix taken separately : find a couple of reduced matrices with NMF, then prediction model is given by equation 4 - RK: not personalized as coefficients are the same for every user ### Personalized recommendation model - same principle as global recommendation method, except that there is first a clustering, and the learning is achieved cluster per cluster - the number of clusters is a parameter of the method ### Model learning with implicit feedback - learning the model parameters (thetas in equation 4) - use implicit feedback to do so (1 = user browses item / 0 = user does not) - usually prediction done with either classification or learning-to-rank but their approach: rank 1s above 0s (in the spirit of ref 21) ##### Bayesian ranking-based optimization - assumption: a user ranking is independent from the others (allow to get eq.7) - assumption on the probability expressed in equation 8 - allows to derive the expression of objective function O ##### optimization algorithm - optimization: finds thetas such that dO/dTheta = 0 - method: Stochastic Gradient Descent ##### learning personalized recommendation models - this technique is not personalized - to personalize the reco: clusters with a k-means method ### Empirical study ##### Data - dataset 1 : IMDB + ML100K (IM100K); if user has seen movie 1 else 0 - dataset 2 : Yelp; if user has reviewed buisness 1 else 0 - d2 much sparser than d1 (see feedback distribs on figure 5) - temporal split 80% / 20% between training and test ##### Competitors and evaluation metrics - RS benchmarks: popularity-based, co-click, NMF (baseline of collaborative filtering), hybrid SVM - for their method: 10 different metapaths différentes (see Table 6.2) - evaluation: as based on implicit feedback, precision at position and top-10 mean reciprocal rank (MRR_k) ##### Performance comparison - Table 3 for a summary - very few items interact with a lot of users - parameters for NMF: dimension of the reduced matrix: 20 (IM100K), 60 (Yelp) - Hybrid-SVM uses the same info as their method (HeteRec) and uses PathSim - in general HeteRec better than all benchmark methods - in particular HeteRec > Hybrid-SVM (while similar information) - improvement higher for Yelp than for IM100K, possibly a consequence of Yelp sparsity - HeteRec-p (personalized version) : even better than HeteRec-g ##### Performance analysis - more precise analysis of the performances on IM100K only for HeteRec-g , HeteRec-p , NMF , Co-Click - divide in 6 different training sets, depending on various parameters - performances increase with the number of movies watched for all methods except co-click - performances decrease with movies popularity for all methods except co-click ##### Parameter tuning - HeteRec have more parameters - regularization parameter lambda (eq 9) computed with cross-validation - sampling necessary for Yelp (as 10^12 elements), performance variations with sampling represented on Fig7 (relatively stable) - for HeteRec-p: number of clusters, see fgure 6c ### Related work ##### CF based hybrid RS ##### Information network analysis
Alt-Tab at 2019-07-01 09:24:19
Edited by Alt-Tab at 2019-07-01 10:14:46

Please consider to register or login to comment on the paper.